The STARK Framework for Spatio-Temporal Data Analytics on Spark

Authors

  • Stefan Hagedorn
  • Philipp Götze
  • Kai-Uwe Sattler

Abstract

Big Data sets can contain all types of information: from server log files to tracking information of mobile users with their location at a point in time. Apache Spark has been widely accepted for Big Data analytics because of its very fast processing model. However, Spark has no native support for spatial or spatio-temporal data. Spatial filters or joins using, e.g., a contains predicate are not supported and would have to be implemented inefficiently by the users. Also, Spark cannot make use of, e.g., spatial distribution for optimal partitioning. Here we present our STARK framework that adds spatio-temporal support to Spark. It includes spatial partitioners, different modes for indexing, as well as filter, join, and clustering operators. In contrast to existing solutions, STARK integrates seamlessly into any (Scala) Spark program and provides more flexible and comprehensive operators. Furthermore, our experimental evaluation shows that our implementation outperforms existing solutions.
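To make the abstract's point about contains filters and spatial partitioning concrete, here is a minimal sketch in plain Python. It is not STARK's API and does not use Spark. The rectangle representation, the fixed-size grid, and every function name are assumptions chosen for illustration only. The sketch contrasts a filter that tests every point against the query region with one that first groups points into grid cells and then inspects only the cells the query overlaps.

```python
from collections import defaultdict

# Illustrative only: these names are hypothetical and are not part of STARK or Spark.
# A rectangle is (min_x, min_y, max_x, max_y); a point is (x, y).

def contains(rect, point):
    """Return True if the axis-aligned rectangle contains the point (borders included)."""
    min_x, min_y, max_x, max_y = rect
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y

def naive_filter(points, query):
    """Test every point against the query: the full scan needed without spatial support."""
    return [p for p in points if contains(query, p)]

def cell_of(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))

def grid_partition(points, cell_size):
    """Group points by grid cell, a stand-in for a spatial partitioner."""
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p, cell_size)].append(p)
    return cells

def partitioned_filter(cells, query, cell_size):
    """Inspect only the cells that overlap the query rectangle."""
    min_x, min_y, max_x, max_y = query
    lo_cx, lo_cy = cell_of((min_x, min_y), cell_size)
    hi_cx, hi_cy = cell_of((max_x, max_y), cell_size)
    hits = []
    for cx in range(lo_cx, hi_cx + 1):
        for cy in range(lo_cy, hi_cy + 1):
            for p in cells.get((cx, cy), []):
                if contains(query, p):
                    hits.append(p)
    return hits

points = [(0.5, 0.5), (1.5, 2.5), (7.0, 8.0), (9.9, 0.1)]
query = (0.0, 0.0, 2.0, 3.0)
cells = grid_partition(points, cell_size=2.0)
naive_hits = naive_filter(points, query)
pruned_hits = partitioned_filter(cells, query, cell_size=2.0)
```

Both filters return the same points. The partitioned version skips the cells holding the two distant points, which is the kind of pruning a spatial partitioner enables in a distributed setting.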


Similar resources

Efficient spatio-temporal event processing with STARK

For Big Data processing, Apache Spark has been widely accepted. However, when dealing with events or any other spatio-temporal data sets, Spark becomes very inefficient as it does not include any spatial or temporal data types and operators. In this paper we demonstrate our STARK project that adds the required data types and operators, such as spatio-temporal filter and join with various predic...


A Framework for Scalable Correlation of Spatio-temporal Event Data

Spatio-temporal event data do not only arise from sensor readings, but also in information retrieval and text analysis. However, such events extracted from a text corpus may be imprecise in both dimensions. In this paper we focus on the task of event correlation, i.e., finding events that are similar in terms of space and time. We present a framework for Apache Spark that provides correlation o...


A Framework for Using Self-Organizing Maps to Analyze Spatio-Temporal Patterns, Exemplified by Analysis of Mobile Phone Usage

We suggest a visual analytics framework for the exploration and analysis of spatially and temporally referenced values of numeric attributes. The framework supports two complementary perspectives on spatio-temporal data: as a temporal sequence of spatial distributions of attribute values (called spatial situations) and as a set of spatially referenced time series of attribute values representin...


Simplification and Refinement for Speedy Spatio-temporal Hot Spot Detection Using Spark

This paper describes a spatio-temporal hot spot identification program submitted to the ACM SIGSPATIAL Cup 2016. The advent of large-scale spatio-temporal data (e.g., vehicle tracking data), together with the availability of in-memory distributed computing frameworks (i.e., Spark), provides an opportunity to quickly identify unusual patterns, also called hot spots, in a statistical manner. We pr...





Publication date: 2017